Towards Eliminating Random I/O in Hash Joins

Authors

Ming-Ling Lo

Chinya V. Ravishankar

Abstract

The widening performance gap between CPU and disk is significant for hash join performance. Most current hash join methods try to reduce the volume of data transferred between memory and disk. In this paper, we try to reduce hash-join times by reducing random I/O. We study how current algorithms incur random I/O, and propose a new hash join method, Seq+, that converts much of the random I/O to sequential I/O. Seq+ uses a new organization for hash buckets on disk, and larger input and output buffer sizes. We introduce the technique of batch writes to reduce the bucket-write cost, and the concepts of write- and read-groups of hash buckets to reduce the bucket-read cost. We derive a cost model for our method, and present formulas for choosing various algorithm parameters, including input and output buffer sizes. Our performance study shows that the new hash join method performs many times better than current algorithms under various environments. Since our cost functions under-estimate the cost of current algorithms and over-estimate the cost of Seq+, the actual performance gain of Seq+ is likely to be even greater.
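To make the role of output buffer size concrete, here is a minimal sketch of the partitioning phase of a generic hash join. It is not the Seq+ algorithm or its cost model. It assumes that each flush of a per-bucket output buffer costs one random write, and the function name `count_bucket_flushes` and its parameters are hypothetical, chosen only for this illustration.

```python
def count_bucket_flushes(keys, num_buckets, buffer_tuples):
    """Partition integer keys into buckets and count buffer flushes.

    Assumption for illustration: every flush of a bucket's output buffer
    stands in for one random write (a seek to that bucket's disk region).
    """
    buffers = [[] for _ in range(num_buckets)]
    flushes = 0
    for key in keys:
        bucket = key % num_buckets  # simple deterministic hash for integer keys
        buffers[bucket].append(key)
        if len(buffers[bucket]) == buffer_tuples:
            buffers[bucket].clear()  # "write" the full buffer to disk
            flushes += 1
    # Flush whatever is left in partially filled buffers.
    flushes += sum(1 for buf in buffers if buf)
    return flushes


small = count_bucket_flushes(range(10_000), num_buckets=10, buffer_tuples=10)
large = count_bucket_flushes(range(10_000), num_buckets=10, buffer_tuples=100)
print(small, large)
```

Under these assumptions, growing each bucket's buffer tenfold cuts the number of flushes tenfold, which is the intuition behind using larger output buffers to trade random writes for longer sequential ones.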

Similar resources

Towards Eliminating Random I/O in Hash Joins

The widening performance gap between CPU and disk is significant for hash join performance. Most current hash join methods try to reduce the volume of data transferred between memory and disk. In this paper, we try to reduce hash-join times by reducing random I/O. We study how current algorithms incur random I/O, and propose a new hash join method, Seq+, that converts much of the random I/O t...

Memory-Efficient Hash Joins

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded populatio...

Memory-Contention Responsive Hash Joins

In order to maximize system performance in environments with fluctuating memory contention, memory-intensive algorithms such as hash join must gracefully adapt to variations in available memory. Mixed workloads, creating fluctuations of erratic frequency and magnitude, make responsiveness to memory contention particularly important. Previous studies on adaptable hash joins have focused on lower...

On a Three-Way Hash Join Algorithm

We develop hash-based algorithms for computing a three-way join. The method involves hashing all three relations into buckets, and then joining buckets in main memory, three buckets at a time. Compared with two cascaded hash joins, the algorithms avoid materializing an intermediate result. We present a cost model for this approach, from which we identify the range of parameters for queries that ...

Using Optimized Multi-Attribute Hash Indexes for Hash Joins

The join operation is one of the most frequently used and expensive query processing operations in relational database systems. One method of joining two relations is to use a hash-based join algorithm. Hash-based join algorithms typically have two phases, a partitioning phase and a partition joining phase. We describe how an optimal multi-attribute hash (MAH) indexing scheme can be used to red...

Publication date: 2004
